event camera



EV-Eye: Rethinking High-frequency Eye Tracking through the Lenses of Event Cameras

Neural Information Processing Systems

In this paper, we present EV-Eye, a first-of-its-kind large-scale multimodal eye tracking dataset aimed at inspiring research on high-frequency eye/gaze tracking. EV-Eye utilizes an emerging bio-inspired event camera to capture independent pixel-level intensity changes induced by eye movements, achieving sub-microsecond latency. Our dataset was curated over a two-week period and collected from 48 participants encompassing diverse genders and age groups. It comprises over 1.5 million near-eye grayscale images and 2.7 billion event samples generated by two DAVIS346 event cameras. Additionally, the dataset contains 675 thousand scene images and 2.7 million gaze references captured by a Tobii Pro Glasses 3 eye tracker for cross-modality validation. Compared with existing event-based high-frequency eye tracking datasets, our dataset is significantly larger, and its gaze references involve more natural eye movement patterns, i.e., fixation, saccade, and smooth pursuit. Alongside the event data, we also present a hybrid eye tracking method as a benchmark, which leverages both the near-eye grayscale images and the event data for robust, high-frequency eye tracking. We show that our method achieves higher accuracy on both pupil and gaze estimation tasks than the existing solution.
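As background for the sensing principle this abstract relies on, a pixel of an event camera fires an event whenever its log-intensity change crosses a contrast threshold, with a signed polarity. The sketch below illustrates that model on a pair of log-intensity frames; the function name, the threshold value `C`, and the frame-pair interface are illustrative assumptions (a real sensor timestamps each event asynchronously per pixel rather than between frames):

```python
import numpy as np

def generate_events(log_I_prev, log_I_curr, t_prev, t_curr, C=0.2):
    """Emit (x, y, t, polarity) events at pixels whose log-intensity
    change exceeds the contrast threshold C (illustrative value).
    Returns an (N, 4) float array."""
    diff = log_I_curr - log_I_prev
    ys, xs = np.nonzero(np.abs(diff) >= C)
    pol = np.sign(diff[ys, xs]).astype(int)
    # Midpoint timestamp is a simplification: a real event camera
    # assigns each event its own sub-millisecond timestamp.
    t = np.full(xs.shape, (t_prev + t_curr) / 2.0)
    return np.stack([xs, ys, t, pol], axis=1)
```

Only pixels that actually change produce output, which is why eye movements yield a sparse, low-latency event stream rather than full frames.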


Learning to Detect Objects with a 1 Megapixel Event Camera

Neural Information Processing Systems

Thanks to these characteristics, event cameras are particularly suited to scenarios with high motion, challenging lighting conditions, and low-latency requirements. However, due to the novelty of the field, the performance of event-based systems on many vision tasks still lags behind conventional frame-based solutions. The main reasons for this performance gap are the lower spatial resolution of event sensors compared to frame cameras, the lack of large-scale training datasets, and the absence of well-established deep learning architectures for event-based processing. In this paper, we address all of these problems in the context of an event-based object detection task. First, we publicly release the first high-resolution large-scale dataset for object detection.


Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation

Zhou, Haoxian, Xu, Chuanzhi, Chen, Langyi, Chen, Haodong, Chung, Yuk Ying, Qu, Qiang, Chen, Xaoming, Cai, Weidong

arXiv.org Artificial Intelligence

Human pose estimation focuses on predicting body keypoints to analyze human motion. Event cameras provide high temporal resolution and low latency, enabling robust estimation under challenging conditions. However, most existing methods convert event streams into dense event frames, which adds extra computation and sacrifices the high temporal resolution of the event signal. In this work, we aim to exploit the spatiotemporal properties of event streams within a point cloud-based framework designed to enhance human pose estimation performance. We design an Event Temporal Slicing Convolution module to capture short-term dependencies across event slices, and combine it with an Event Slice Sequencing module for structured temporal modeling. We also apply edge enhancement to the point cloud-based event representation, strengthening spatial edge information under sparse event conditions to further improve performance. Experiments on the DHP19 dataset show that our proposed method consistently improves performance across three representative point cloud backbones: PointNet, DGCNN, and Point Transformer.
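The temporal slicing step this abstract describes — partitioning a raw event stream into ordered slices that can each be treated as a small point cloud — can be sketched as below. This is an illustrative sketch of the general slicing idea, not the paper's actual module; the equal-duration partitioning and the `(x, y, t, p)` column layout are assumptions:

```python
import numpy as np

def slice_events(events, num_slices):
    """Partition an (N, 4) array of (x, y, t, p) events into
    equal-duration temporal slices, each a small point cloud.
    Illustrative sketch; the paper's modules then model
    dependencies across and within these slices."""
    t = events[:, 2]
    edges = np.linspace(t.min(), t.max(), num_slices + 1)
    # Map each event timestamp to its slice index [0, num_slices - 1].
    idx = np.clip(np.searchsorted(edges, t, side="right") - 1, 0, num_slices - 1)
    return [events[idx == k] for k in range(num_slices)]
```

Keeping the events as timestamped points, rather than accumulating them into dense frames, is what preserves the temporal resolution the abstract emphasizes.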


High-Speed Event Vision-Based Tactile Roller Sensor for Large Surface Measurements

Khairi, Akram, Sajwani, Hussain, Alkilany, Abdallah Mohammad, AbuAssi, Laith, Halwani, Mohamad, Zaid, Islam Mohamed, Awadalla, Ahmed, Swart, Dewald, Ayyad, Abdulla, Zweiri, Yahya

arXiv.org Artificial Intelligence

Inspecting large-scale industrial surfaces like aircraft fuselages for quality control requires precise, high-resolution 3D geometry. Vision-based tactile sensors (VBTSs) offer high local resolution but require slow 'press-and-lift' measurements for large areas. Sliding or roller/belt VBTS designs provide continuous measurement but face significant challenges: sliding suffers from friction and wear, while both are speed-limited by camera frame rates and motion blur. Thus, a rapid, continuous, high-resolution method is needed. We introduce a novel neuromorphic tactile roller sensor. It uses a modified event-based multi-view stereo algorithm for 3D reconstruction, leveraging high temporal resolution and motion blur robustness. This reconstruction is most effective for surfaces with distinct edges or sharp features, which are often the most critical for defect detection in industrial inspection tasks. We demonstrate 0.5 m/s scanning speeds with MAE below 100 µm (11x faster than prior methods). A multi-reference Bayesian fusion strategy reduces MAE by 25.2%.

Surface metrology and surface inspection are crucial elements in quality assurance across diverse industries, particularly aerospace and automotive manufacturing. Precise inspection is required to identify characteristics like paint quality, coating integrity, and subtle defects such as cracks, nicks, and dents [1], [2], [3]. Often, achieving a resolution of 0.1 mm or lower is necessary to accurately classify these features and ensure component integrity and safety [4]. Traditional contact-based methods, including high-precision profilometers [5], [6] or microscopic techniques [7], [8], [9], offer high resolution locally but become exceedingly time-consuming when applied to large surface areas due to their sequential, point-by-point or small-patch measurement nature.
Non-contact optical methods, such as cameras, laser scanners, or structured light systems [2], [10], [11], [12], [13], [14], can significantly accelerate inspection by capturing data over wider areas. However, these methods often lack robustness; their performance can be compromised by variations in ambient lighting, motion blur when attempting high-speed scanning, or challenging surface optical properties like high reflectivity or transparency [15].
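The multi-reference Bayesian fusion mentioned in the abstract can be illustrated with the standard Gaussian inverse-variance fusion rule, which combines independent depth estimates from several reference views into one lower-variance estimate. This is a minimal sketch of that general rule under a Gaussian-independence assumption, not the paper's actual fusion pipeline:

```python
def fuse_depths(means, variances):
    """Fuse independent Gaussian depth estimates (one per reference
    view) by inverse-variance weighting. More confident references
    (smaller variance) pull the fused estimate toward themselves,
    and the fused variance is always smaller than any single input."""
    weights = [1.0 / v for v in variances]
    fused_mean = sum(m * w for m, w in zip(means, weights)) / sum(weights)
    fused_var = 1.0 / sum(weights)
    return fused_mean, fused_var
```

For example, fusing two equally confident estimates of 1.0 and 3.0 yields a mean of 2.0 with half the variance of either input.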


Ultralight Polarity-Split Neuromorphic SNN for Event-Stream Super-Resolution

Xu, Chuanzhi, Zhou, Haoxian, Chen, Langyi, Chung, Yuk Ying, Qu, Qiang

arXiv.org Artificial Intelligence

Event cameras offer unparalleled advantages such as high temporal resolution, low latency, and high dynamic range. However, their limited spatial resolution poses challenges for fine-grained perception tasks. In this work, we propose an ultra-lightweight, stream-based event-to-event super-resolution method based on Spiking Neural Networks (SNNs), designed for real-time deployment on resource-constrained devices. To further reduce model size, we introduce a novel Dual-Forward Polarity-Split Event Encoding strategy that decouples positive and negative events into separate forward paths through a shared SNN. Furthermore, we propose a Learnable Spatio-temporal Polarity-aware Loss (LearnSTPLoss) that adaptively balances temporal, spatial, and polarity consistency using learnable uncertainty-based weights. Experimental results demonstrate that our method achieves competitive super-resolution performance on multiple datasets while significantly reducing model size and inference time. The lightweight design enables embedding the module into event cameras or using it as an efficient front-end preprocessing for downstream vision tasks.
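The polarity-split idea described above — decoupling positive and negative events into separate streams that share one network — can be sketched as a simple partitioning step. This is a hedged illustration of the encoding concept only; the function name and the `(x, y, t, p)` tuple layout are assumptions, and the paper's dual-forward SNN itself is not reproduced here:

```python
def polarity_split(events):
    """Split a list of (x, y, t, p) event tuples into positive and
    negative streams. Each stream could then be passed through the
    same (shared-weight) network in its own forward pass, halving
    per-pass input channels without duplicating parameters."""
    positive = [e for e in events if e[3] > 0]
    negative = [e for e in events if e[3] < 0]
    return positive, negative
```

Sharing one network across the two passes is what keeps the parameter count low while still treating the two polarities as distinct signals.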


Count Every Rotation and Every Rotation Counts: Exploring Drone Dynamics via Propeller Sensing

Chen, Xuecheng, Xu, Jingao, Ding, Wenhua, Wang, Haoyang, Luo, Xinyu, Duan, Ruiyang, Chen, Jialong, Wang, Xueqian, Liu, Yunhao, Chen, Xinlei

arXiv.org Artificial Intelligence

As drone-based applications proliferate, contactless sensing of airborne drones from the ground becomes indispensable. This work demonstrates that concentrating on propeller rotational speed substantially improves drone sensing performance and proposes an event-camera-based solution, \sysname. \sysname features two components: Count Every Rotation achieves accurate, real-time propeller speed estimation by mitigating the ultra-high sensitivity of event cameras to environmental noise, and Every Rotation Counts leverages these speeds to infer both internal and external drone dynamics. Extensive evaluations in real-world drone delivery scenarios show that \sysname achieves a sensing latency of 3 ms and a rotational speed estimation error of merely 0.23%. Additionally, \sysname infers drone flight commands with 96.5% precision and improves drone tracking accuracy by over 22% when combined with other sensing modalities. Demo: https://eventpro25.github.io/EventPro/.